Minimizing Safety Interference for Safe and Comfortable Automated Driving with Distributional Reinforcement Learning
Despite recent advances in reinforcement learning (RL), its application in
safety-critical domains such as autonomous vehicles is still challenging.
Although punishing RL agents for risky situations can help them learn safe
policies, it may also lead to overly conservative behavior. In this paper, we
propose a distributional RL framework that learns adaptive policies able to
tune their level of conservatism at run-time based on the desired comfort and
utility.
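The run-time tunable conservatism can be pictured with a quantile-based
distributional critic, where a risk level decides how much of the lower tail
of the predicted return distribution drives action selection. Below is a
minimal sketch of this idea, assuming a QR-DQN-style critic; the function
names, shapes, and the CVaR-style risk measure are illustrative assumptions,
not the paper's actual interface.

```python
# Minimal sketch: run-time tunable conservatism with a quantile
# distributional critic (QR-DQN style). All names are illustrative
# assumptions, not the paper's interface.
import numpy as np

def cvar_q_values(quantiles: np.ndarray, alpha: float) -> np.ndarray:
    """Average the lowest alpha-fraction of return quantiles per action.

    quantiles: array of shape (num_actions, num_quantiles), estimates
               of the return distribution Z(s, a).
    alpha:     risk level in (0, 1]; alpha = 1.0 recovers the mean
               (risk-neutral), small alpha focuses on worst outcomes.
    """
    k = max(1, int(np.ceil(alpha * quantiles.shape[1])))
    return np.sort(quantiles, axis=1)[:, :k].mean(axis=1)

def select_action(quantiles: np.ndarray, alpha: float) -> int:
    """Pick the action maximizing the risk-adjusted predicted return."""
    return int(np.argmax(cvar_q_values(quantiles, alpha)))

# Example: 3 actions, 8 quantiles each. Lowering alpha at run-time makes
# the same trained network behave more conservatively.
rng = np.random.default_rng(0)
z = np.sort(rng.normal(size=(3, 8)), axis=1)
print(select_action(z, alpha=1.0))   # risk-neutral choice
print(select_action(z, alpha=0.25))  # conservative choice
```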
Using a proactive safety verification approach, the proposed framework can
guarantee that actions generated by RL are fail-safe under worst-case
assumptions. Concurrently, the policy is encouraged to minimize safety
interference and to generate more comfortable behavior.
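The interplay between the verifier and the policy can be sketched as a safety
layer that executes the RL action only when a fail-safe maneuver remains
available under worst-case assumptions, and otherwise falls back while
recording an interference event. The simple kinematic check below is a
hypothetical stand-in for the paper's verification procedure.

```python
# Minimal sketch of a proactive safety layer around the RL policy: an
# action is executed only if, under worst-case assumptions, a fail-safe
# maneuver (here, stopping before the conflict zone) stays reachable;
# otherwise the verified fallback is taken. The kinematics below are
# hypothetical stand-ins for the paper's verification procedure.

def fail_safe_ok(ego_speed: float, gap_to_conflict: float,
                 max_decel: float) -> bool:
    """Check the ego vehicle can still stop before the conflict zone."""
    braking_distance = ego_speed ** 2 / (2.0 * max_decel)
    return braking_distance <= gap_to_conflict

def apply_safety_layer(rl_accel: float, ego_speed: float,
                       gap_to_conflict: float, dt: float = 0.1,
                       max_decel: float = 6.0) -> tuple[float, bool]:
    """Return (executed acceleration, interference flag)."""
    next_speed = max(0.0, ego_speed + rl_accel * dt)
    next_gap = gap_to_conflict - 0.5 * (ego_speed + next_speed) * dt
    if fail_safe_ok(next_speed, next_gap, max_decel):
        return rl_accel, False       # RL action verified fail-safe
    return -max_decel, True          # fall back to emergency braking
```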
We trained and evaluated the proposed approach and baseline policies in a
high-level simulator with a variety of randomized scenarios, including several
corner cases that rarely occur in reality but are critical for safety. Our
experiments show that policies learned using distributional RL can adapt at
run-time and are robust to environment uncertainty.
Quantitatively, the learned distributional RL agent drives on average 8
seconds faster than the standard DQN policy and requires 83% less safety
interference than the rule-based policy, while only slightly increasing the
average crossing time. We also study the sensitivity of the learned policy in
environments with higher perception noise and show that our algorithm learns
policies that still drive reliably for automated merging and crossing at
occluded intersections when the perception noise is twice as high as in the
training configuration.